DETAILED ACTION

1. This communication is responsive to the Amendment filed 20 November 2006.

2. Claims 1-25 are pending in this application. Claims 1 and 14 are independent. 
In the Amendment filed 20 November 2006, claims 1-4 and 14-16 have been amended. 
This action is made Non-Final. 

3. The rejections of claims 1 and 6 as being anticipated by the article "IntelliClean:
A Knowledge-Based Intelligent Data Cleaner" by Lee et al; claims 2-4 as being 
unpatentable over the article "IntelliClean: A Knowledge-Based Intelligent Data Cleaner" 
by Lee et al in view of the article "Mining Generalized Patterns from Web Logs" by Ling 
et al; claims 14-16 and 18 as being unpatentable over the article "IntelliClean: A 
Knowledge-Based Intelligent Data Cleaner" by Lee et al in view of the article "Mining 
Generalized Patterns from Web Logs" by Ling et al; claims 5 and 17 as being 
unpatentable over the article "IntelliClean: A Knowledge-Based Intelligent Data Cleaner" 
by Lee et al in view of the article "Better Rules, Fewer Features: A Semantic Approach 
to Selecting Features from Text" by Blake et al; claims 7-8 and 19-20 as being 
unpatentable over the article "IntelliClean: A Knowledge-Based Intelligent Data Cleaner" 
by Lee et al in view of the article "Faster Algorithm of String Comparison" by Yang et al; 
and claims 9-13 and 21-25 as being unpatentable over the article "IntelliClean: A 
Knowledge-Based Intelligent Data Cleaner" by Lee et al in view of the article "From 
Data Mining to Knowledge Discovery in Databases" by Fayyad et al have been 
withdrawn as necessitated by amendment. 

Claim Rejections - 35 USC § 101

35 U.S.C. 101 reads as follows: 

Whoever invents or discovers any new and useful process, machine, manufacture, or composition of 
matter, or any new and useful improvement thereof, may obtain a patent therefor, subject to the 
conditions and requirements of this title. 

4. Claims 1-13 are rejected under 35 U.S.C. 101 because the claimed invention is
directed to non-statutory subject matter.

Claim 1 recites a method of compressing a log of linguistic data, the log having a 
plurality of linguistic help query strings, each string including at least two tokens,
the method comprising: applying a compression operation to each string; determining if 
any two strings match each other after the compression operation; and removing one of 
the two matching strings from the log. 
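
For illustration only, the three recited steps can be sketched as follows. This is a minimal sketch under stated assumptions and is not taken from the application or from any cited reference: the names compress and compress_log are hypothetical, and the particular compression operation (lowercasing, dropping punctuation, collapsing whitespace) is assumed purely as an example.

```python
import re


def compress(query: str) -> str:
    """Hypothetical compression operation (an assumption, not the claimed one):
    lowercase the string, drop punctuation, and collapse runs of whitespace."""
    cleaned = re.sub(r"[^a-z0-9\s]", "", query.lower())
    return " ".join(cleaned.split())


def compress_log(log: list[str]) -> list[str]:
    """Apply the compression operation to each string, detect strings that
    match after compression, and keep only the first string of each match."""
    seen = set()
    kept = []
    for query in log:
        key = compress(query)
        if key in seen:
            # A matching string is already in the log, so this one is removed.
            continue
        seen.add(key)
        kept.append(query)
    return kept
```

In this sketch, a log in which no two strings match after compression is returned unchanged.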

If the two strings match each other, then one of the matching strings is removed
from the log. However, if it is determined that none of the strings match each other, the 
result of the method is unclear. Therefore, the claimed subject matter lacks a practical 
application of a judicial law exception (law of nature, abstract idea, naturally occurring 
article/phenomenon) since it fails to produce a useful, concrete and tangible result. 

Specifically, the claimed subject matter does not produce a tangible result 
because the claimed subject matter fails to produce a result that is limited to having real 
world value rather than a result that may be interpreted to be abstract in nature as, for 
example, a thought, a computation, or manipulated data. More specifically, the claimed 
subject matter provides for removing one of the two matching strings from the log only
when it is determined that the two strings match. 

Claims 2-13 are dependent on claim 1 and therefore are rejected on the same
grounds as claim 1. 

To allow for compact prosecution, the examiner will apply prior art to these 
claims as best understood, with the assumption that applicant will amend to overcome 
the stated 101 rejections. 

Claim Rejections - 35 USC § 103 

The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

This application currently names joint inventors. In considering patentability of
the claims under 35 U.S.C. 103(a), the examiner presumes that the subject matter of
the various claims was commonly owned at the time any inventions covered therein
were made absent any evidence to the contrary. Applicant is advised of the obligation
under 37 CFR 1.56 to point out the inventor and invention dates of each claim that was
not commonly owned at the time a later invention was made in order for the examiner to
consider the applicability of 35 U.S.C. 103(c) and potential 35 U.S.C. 102(e), (f) or (g)
prior art under 35 U.S.C. 103(a).

5. Claims 1-24 are rejected under 35 U.S.C. 103(a) as being unpatentable over 
US Patent No. 6,584,464 to Warthen (hereafter Warthen) in view of US PGPub 
2004/0199498 to Kapur et al (hereafter Kapur). 

Referring to claim 1, Warthen discloses a method of compressing a log of
linguistic data (see column 4, lines 65-66), the log having a plurality of linguistic help
query strings [questions] (see column 4, lines 38-41), each string including at
least two tokens [i.e., Where can I find information on the sport bicycling?] (see column
4, lines 32-36). However, Warthen fails to explicitly disclose the further limitations of the 
actual steps taken to compress the log. Kapur discloses receiving various query log 
files from various sources and then processing the logs (see [0035], lines 1-8), including 
the further limitations of: 

applying a compression operation [canonicalized] to each string (see [0036], 
lines 3-5); 

determining if any two strings match each other after the compression operation 
[consolidate] (see Fig 5, step 510); and 

removing one of the two matching strings from the log [multiple occurrences of 
the same query are included as a single query] (see [0035], lines 10-13). 

It would have been obvious to one of ordinary skill in the art at the time of the 
invention to use the query processing engine disclosed by Kapur to compress the log of 
questions disclosed by Warthen. One would have been motivated to do so in order to
produce a set of questions which improve the process of determining the context of a
user query and then associating the most useful result with the query (Warthen: see
column 1, lines 43-51).

Referring to claim 2, the combination of Warthen and Kapur (hereafter
Warthen/Kapur) discloses the method of claim 1, wherein the log is a log of
user-initiated inputs [users' questions] to a help interface [client interface] (Warthen: see
column 3, lines 59-67).

Referring to claim 3, Warthen/Kapur discloses the method of claim 2, wherein 
each string is a query relative to a help function (Warthen: see column 3, lines 59-67). 

Referring to claim 4, Warthen/Kapur discloses the method of claim 3, wherein 
each help-related query is relative to a computer system [corporate network answering 
employee questions] (Warthen: see column 3, lines 59-67). 

Referring to claim 5, Warthen/Kapur discloses the method of claim 1, wherein
the compression operation is character-based [removing odd symbols] (Kapur: see 
[0039]-[0048]). 

Referring to claim 6, Warthen/Kapur discloses the method of claim 1, wherein
the compression operation is token-based (Kapur: see [0036], lines 23-28). 

Referring to claim 7, Warthen/Kapur discloses the method of claim 1, wherein
the compression operation is subsumption (Kapur: see [0039]-[0048]). 

Referring to claim 8, Warthen/Kapur discloses the method of claim 7, wherein 
subsumption includes applying an impossibility condition to selectively compute edit 
distance [edit distance d] (Kapur: see [0048], lines 13-31). 
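
For illustration only, one way an impossibility condition could be used to compute edit distance selectively is sketched below. This is an assumption and is not drawn from the application or from Kapur: the sketch relies on the fact that the edit distance between two strings is never smaller than the difference in their lengths, so pairs whose lengths differ by more than a threshold can be rejected without computing the distance. The names edit_distance, within_distance, and max_dist are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def within_distance(a: str, b: str, max_dist: int) -> bool:
    """Return True if a and b are within max_dist edits of each other."""
    # Assumed impossibility condition: the distance is at least the length
    # difference, so the full computation is skipped when that bound fails.
    if abs(len(a) - len(b)) > max_dist:
        return False
    return edit_distance(a, b) <= max_dist
```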



Application/Control Number: 10/796,644 Page 7 

Art Unit: 2167 

Referring to claim 9, Warthen/Kapur discloses the method of claim 1, and
further comprising: 

applying a second compression operation [tokenized] to each string (Kapur: see 
[0036], lines 23-28); 

determining if any two strings match each other after the second compression 
operation [convergence] (Kapur: see [0038], lines 1-2); and 

removing one of the two matching strings from the log (see [0038], lines 5-6). 

Referring to claim 10, Warthen/Kapur discloses the method of claim 9, wherein 
the first compression operation is character-based [canonicalize - item 500] and the 
second compression operation is token based [tokenize - item 520] (Kapur: see Fig 5). 
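
For illustration only, a character-based pass followed by a token-based pass can be sketched as shown below. This is a minimal sketch under stated assumptions and does not reproduce Kapur's canonicalize or tokenize steps: the stop-word list, the token sorting, and the names char_compress, token_compress, dedupe, and two_pass are all hypothetical.

```python
import re

# Hypothetical stop-word list, assumed only for this example.
STOPWORDS = {"a", "an", "the", "do", "i", "to", "how"}


def char_compress(query: str) -> str:
    """Character-based pass: lowercase, drop punctuation, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9\s]", "", query.lower())
    return " ".join(cleaned.split())


def token_compress(query: str) -> str:
    """Token-based pass: drop stop words and sort the remaining tokens."""
    tokens = [t for t in query.split() if t not in STOPWORDS]
    return " ".join(sorted(tokens))


def dedupe(log, operation):
    """Apply one operation to each string and keep one copy of each result."""
    seen = set()
    kept = []
    for query in log:
        key = operation(query)
        if key not in seen:
            seen.add(key)
            kept.append(key)
    return kept


def two_pass(log):
    """Run the character-based pass first, then the token-based pass."""
    return dedupe(dedupe(log, char_compress), token_compress)
```

Under these assumptions, strings that differ only in punctuation or case are merged by the first pass, and strings that differ only in stop words or word order are merged by the second.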

Referring to claim 11, Warthen/Kapur discloses the method of claim 10, and 
further comprising applying subsumption [Generation of Extensions, Associations and 
Alternatives - items 570, 580 and 590] after the second compression operation 
[tokenize] is complete (Kapur: see Fig 5). 

Referring to claim 12, Warthen/Kapur discloses the method of claim 11,
wherein the subsumption operation is repeated for the log [the log would be processed 
by the processing engine one more time] (Kapur: see [0035], lines 3-8). 

Referring to claim 13, Warthen/Kapur discloses the method of claim 1, and
further comprising training a statistical process with the compressed log (Kapur: see 
[0008]). 

Referring to claim 14, Warthen discloses a system for compressing a query log 
having a plurality of linguistic help query strings [questions] (see column 4, lines 38-41
and 65-66), each string having a plurality of tokens [i.e., Where can I find information on
the sport bicycling?] (see column 4, lines 32-36). However, Warthen fails to explicitly 
disclose the further limitations of the actual steps taken to compress the log. Kapur 
discloses receiving various query log files from various sources and then processing the 
logs (see [0035], lines 1-8), including the further limitations of: 

an input for receiving a raw query log (see [0035], lines 3-8); 

memory [memory or database file 310] for storing the raw query log (see [0035], 
lines 19-31); 

a processor [query processing engine 300] (see Fig 3) for applying at least one 
compression operation [canonicalized] to each string (see [0036], lines 3-5), and for 
scanning the modified strings to determine if any match each other [consolidate] (see 
Fig 5, step 510) so that one of the matching strings can be removed (see [0035], lines 
10-13); and 

an output for providing a compressed query log once the removal is complete 
(see [0025]) in order to produce a set of questions which improve the process of 
determining the context of a user query and then associating the most useful result with 
the query. 

It would have been obvious to one of ordinary skill in the art at the time of the 
invention to use the query processing engine disclosed by Kapur to compress the log of 
questions disclosed by Warthen. One would have been motivated to do so in order to 
produce a set of questions which improve the process of determining the context of a 
user query and then associating the most useful result with the query (Warthen: see
column 1, lines 43-51). 

Referring to claim 15, Warthen/Kapur discloses the system of claim 14, wherein 
each string is a query relative to a help function (Warthen: see column 3, lines 59-67). 

Referring to claim 16, Warthen/Kapur discloses the system of claim 15, wherein 
each help-related query is relative to a computer system [corporate network answering 
employee questions] (Warthen: see column 3, lines 59-67). 

Referring to claim 17, Warthen/Kapur discloses the system of claim 14, wherein 
the compression operation is character-based [removing odd symbols] (Kapur: see 
[0039]-[0048]).

Referring to claim 18, Warthen/Kapur discloses the system of claim 14, wherein 
the compression operation is token-based (Kapur: see [0036], lines 23-28). 

Referring to claim 19, Warthen/Kapur discloses the system of claim 14, wherein 
the compression operation is subsumption (Kapur: see [0039]-[0048]). 

Referring to claim 20, Warthen/Kapur discloses the system of claim 19, wherein 
subsumption includes applying an impossibility condition to selectively compute edit 
distance [edit distance d] (Kapur: see [0048], lines 13-31). 

Referring to claim 21, Warthen/Kapur discloses the system of claim 14, and 
further comprising: 

applying a second compression operation [tokenized] to each string (Kapur: see 
[0036], lines 23-28); 

determining if any two strings match each other after the second compression
operation [convergence] (Kapur: see [0038], lines 1-2); and 

removing one of the two matching strings from the log (see [0038], lines 5-6). 

Referring to claim 22, Warthen/Kapur discloses the system of claim 21, wherein
the first compression operation is character-based [canonicalize - item 500] and the 
second compression operation is token based [tokenize - item 520] (Kapur: see Fig 5). 

Referring to claim 23, Warthen/Kapur discloses the system of claim 22, and 
further comprising applying subsumption [Generation of Extensions, Associations and 
Alternatives - items 570, 580 and 590] after the second compression operation 
[tokenize] is complete (Kapur: see Fig 5). 

Referring to claim 24, Warthen/Kapur discloses the system of claim 23, wherein 
the subsumption operation is repeated for the log [the log would be processed by the 
processing engine one more time] (Kapur: see [0035], lines 3-8). 

Referring to claim 25, Warthen/Kapur discloses the system of claim 14, and 
further comprising training a statistical process with the compressed log (Kapur: see
[0008]). 

Response to Arguments 

6. Applicant's arguments with respect to claims 1-25 have been considered but are 
moot in view of the new ground(s) of rejection. 

Any inquiry concerning this communication or earlier communications from the
examiner should be directed to Kimberly Lovel whose telephone number is (571) 272-2750.
The examiner can normally be reached on 8:00 - 4:00.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, John Cottingham, can be reached on (571) 272-7079. The fax phone
number for the organization where this application or proceeding is assigned is
571-273-8300.

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a
USPTO Customer Service Representative or access to the automated information
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.